Abdominal CT metrics in 17,646 patients reveal associations between myopenia, myosteatosis, and medical phenotypes: a phenome-wide association study

Summary
Background Deep learning facilitates large-scale automated imaging evaluation of body composition. However, associations of body composition biomarkers with medical phenotypes have been underexplored. Phenome-wide association study (PheWAS) techniques search for medical phenotypes associated with biomarkers. A PheWAS integrating large-scale analysis of imaging biomarkers and electronic health record (EHR) data could discover previously unreported associations and validate expected associations. Here we use PheWAS methodology to determine the association of abdominal CT-based skeletal muscle metrics with medical phenotypes in a large North American cohort.
Methods An automated deep learning pipeline was used to measure skeletal muscle index (SMI; biomarker of myopenia) and skeletal muscle density (SMD; biomarker of myosteatosis) from abdominal CT scans of adults between 2012 and 2018. A PheWAS was performed with logistic regression using patient sex and age as covariates to assess for associations between CT-derived muscle metrics and 611 common EHR-derived medical phenotypes. PheWAS P values were considered significant at a Bonferroni corrected threshold (α = 0.05/1222).
Findings 17,646 adults (mean age, 56 years ± 19 [SD]; 57.5% women) were included. CT-derived SMI was significantly associated with 268 medical phenotypes; SMD with 340 medical phenotypes. Previously unreported associations with the highest magnitude of significance included higher SMI with decreased cardiac dysrhythmias (OR [95% CI], 0.59 [0.55–0.64]; P < 0.0001), decreased epilepsy (OR, 0.59 [0.50–0.70]; P < 0.0001), and increased elevated prostate-specific antigen (OR, 1.84 [1.47–2.31]; P < 0.0001), and higher SMD with decreased decubitus ulcers (OR, 0.36 [0.31–0.42]; P < 0.0001), sleep disorders (OR, 0.39 [0.32–0.47]; P < 0.0001), and osteomyelitis (OR, 0.43 [0.36–0.52]; P < 0.0001).
Interpretation PheWAS methodology reveals previously unreported associations between CT-derived biomarkers of myopenia and myosteatosis and EHR medical phenotypes. The high-throughput PheWAS technique applied on a population scale can generate research hypotheses related to myopenia and myosteatosis and can be adapted to research possible associations of other imaging biomarkers with hundreds of EHR medical phenotypes.
Funding National Institutes of Health, Stanford AIMI-HAI pilot grant, Stanford Precision Health and Integrated Diagnostics, Stanford Cardiovascular Institute, Stanford Center for Digital Health, and Stanford Knight-Hennessy Scholars.


Introduction
More importantly, the application of automated segmentation methods to clinical CT scans may "opportunistically" improve patient care by quantifying image metrics beyond those in the original reason for examination. Identifying patients at risk for myopenia and myosteatosis on CT scans obtained for other reasons may lead to early interventions with diet and exercise, and in the future with pharmacologic therapy, thereby reducing morbidity and mortality.
As in many other diseases, research on myopenia has benefited from genome-wide association studies that enable the identification of single nucleotide variants associated with myopenia.11 Research using genome-wide association studies has led to research using phenome-wide association studies (PheWAS) that search for phenotypes associated with specific single nucleotide variants.12 Hundreds of PheWAS, mostly focused on genomic biomarkers, have been previously carried out;13 however, this methodology has not been used to study myopenia and myosteatosis.
Recently, in place of genotypes, PheWAS methodology has been adapted to study imaging metrics or laboratory measurements for deriving novel hypotheses about disease associations in large populations.8,14,15 To our knowledge, no studies have used PheWAS methodology to study imaging markers of myopenia or myosteatosis. In this study, we apply PheWAS methodology to integrate large-scale analysis of CT muscle metrics and EHR data. The aim of the study was to systematically determine the association of abdominal CT-based skeletal muscle metrics with medical phenotypes using PheWAS methodology in a large North American cohort.

Research in context

Evidence before this study
We searched PubMed and the Cochrane Library from database inception to October 22, 2023 using keywords "phenome-wide association studies (PheWAS)" AND "muscle", "myopenia", "myosteatosis", "body composition", "computed tomography (CT)", or "diagnostic imaging" with no language restrictions. Previous PheWAS studies have evaluated the association of genetic features with observable phenotypes, and fully automated approaches can now combine deep learning-based computer vision analysis of routinely acquired medical images with known EHR-based medical phenotypes. Although PheWAS methodology can now be used to study the association between imaging metrics and disease phenotypes, no studies have used PheWAS methodology to study imaging markers of the largest protein reservoir in the human body, skeletal muscle.

Added value of this study
In this study, we apply PheWAS methodology to integrate large-scale analysis of CT muscle metrics and EHR data to determine the association of abdominal CT-based skeletal muscle mass (skeletal muscle index [SMI], used to assess for myopenia) and quality (skeletal muscle density [SMD], used to assess for myosteatosis) with medical phenotypes in a retrospective cohort of 17,646 North American adults. Of 611 medical phenotypes, 213 were significantly associated with both SMI and SMD. Both the effect size and the statistical significance (after Bonferroni correction) of each association of SMI and SMD with the EHR-derived medical phenotypes are determined and are supported with interactive multimedia displays. We identify the presence and magnitude of previously unreported phenotype associations for both SMI and SMD. For example, we find previously unreported associations of higher SMI with decreased cardiac dysrhythmias (OR, 0.59; P < 0.0001) and decreased epilepsy (OR, 0.59; P < 0.0001). We also find expected associations that have been reported in the literature and are validated by our PheWAS approach, such as higher SMI associated with decreased protein-calorie malnutrition (OR, 0.30 [0.28-0.32]; P < 0.0001) and higher SMD associated with decreased morbid obesity (OR, 0.28 [0.26-0.31]; P < 0.0001).

Implications of all the available evidence
Our study shows that deep learning can be fully automated to analyse routinely acquired CT scans for body composition alterations indicative of myopenia and myosteatosis, and this affords the opportunity to evaluate for a wide variety of disease associations at a population level. When significant associations are identified with PheWAS, hypothesis-generating analysis can focus on distinct biological mechanisms of disease and specific causal effects of imaging biomarkers on disease phenotypes. With appropriate replication and validation in independent populations, the application of biological knowledge to diagnostic imaging in a standardized high-throughput PheWAS framework could enable accelerated evaluation of previously unreported strategies for disease prevention and management, such as pharmaceutical repositioning of existing drugs to treat diseases that share common pathophysiology.

Overview
An overview of the study methodology is presented in Fig. 1. Briefly, consecutive abdominal CT examinations of adults along with their medical records were retrospectively collected. SMI and SMD metrics were obtained using a previously validated automated deep learning pipeline. Diagnosis codes were mapped to medical phenotypes. PheWAS methodology was used to identify associations between CT metrics and medical phenotypes.

Data sources
We identified consecutive patients over the age of 18 years who had intravenous contrast-enhanced abdominal CT scans at Stanford Health Care (an academic medical centre providing tertiary-level care) between December 2012 and October 2018. We excluded patients for whom no height information was available in the electronic health record (EHR), patients with height <100 cm, and any repeated scans, such that only the earliest scan for each patient was included. From the original set of 28,261 CT scans, 8791 were excluded as repeat examinations of an already included patient, 4 were excluded because they came from individuals with height <100 cm, and 1820 were excluded because the patient's EHR lacked the height information required for SMI calculation (Fig. 2), leaving 17,646 scans. Information for all patients derived from the EHR included: age, sex, race and ethnicity, height, weight, body mass index (BMI), and International Classification of Diseases (ICD-9 and -10) codes. No EHR-derived information was missing for the patients included in this study.
CT scans were performed on 17 multislice CT scanners made by four manufacturers (9918 scans on GE scanners, 7724 on Siemens, 2 on Philips, and 2 on Toshiba). Prior multi-institutional work with CT phantom evaluation of 67,392 American College of Radiology accreditation examinations has demonstrated that muscle density surrogates had only a small degree of change across CT scanners made by these four manufacturers.16 All scanners in this study underwent daily calibration using a quality assurance phantom, in accordance with the quality assurance specifications of the American College of Radiology, with contemporaneous daily recording of all calibration results. CT protocol parameters included a tube voltage mode of 120 kV (range, 70-140 kV), slice thickness mode of 1.25 mm (range, 0.5-5 mm), standard rotation time of 0.5 s, effective current setting on the basis of body mass index, and variable contrast phase. We used a rule-based algorithm to identify the delayed contrast phase for each scan.17
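The exclusion counts and the per-manufacturer scan counts quoted in this section can be cross-checked with simple arithmetic. The short Python sketch below only restates numbers given in the text; the variable names are illustrative and are not taken from the study's code.

```python
# Counts quoted in the Data sources section.
initial_scans = 28_261
excluded = {
    "repeat examination of an already included patient": 8_791,
    "height < 100 cm": 4,
    "height missing from the EHR": 1_820,
}

# Scans remaining after all exclusions.
included = initial_scans - sum(excluded.values())

# Scans per scanner manufacturer, as listed in the text.
scans_by_manufacturer = {"GE": 9_918, "Siemens": 7_724, "Philips": 2, "Toshiba": 2}
total_by_manufacturer = sum(scans_by_manufacturer.values())

print(included, total_by_manufacturer)
```

Both totals agree with the 17,646 patients reported for the cohort.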

CT-derived muscle metrics
An automated pipeline that was previously validated at our institution was used to obtain CT-derived skeletal muscle metrics. An axial slice at the level of the third lumbar vertebra (L3) was selected using a deep learning algorithm consisting of a two-dimensional fully-convolutional neural network.18 This axial slice was then input into a second deep learning algorithm, a 2.5-dimensional convolutional neural network, which labelled pixels as muscle or non-muscle. Muscle cross-sectional area in cm² and average radiodensity (SMD) in Hounsfield units (HU) were obtained from each image. The cross-sectional area was divided by the square of the patient's height in meters to derive the skeletal muscle index (SMI [cm²/m²]). Sex-specific means and standard deviations of SMI and SMD were computed for males (n = 7492) and females (n = 10,154). Sex-specific z-scores for SMI and SMD were created by subtracting the corresponding sex-specific mean from each patient's metric and dividing by the sex-specific standard deviation. This tool was used because prior validation yielded average measurement errors <1% for both SMI and SMD.18
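The SMI and z-score calculations described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's pipeline: the function names and the toy values are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, stdev


def skeletal_muscle_index(area_cm2, height_m):
    """SMI in cm^2/m^2: muscle cross-sectional area divided by height squared."""
    return area_cm2 / height_m ** 2


def sex_specific_z_scores(values, sexes):
    """Standardise each value against the mean and SD of that patient's sex group.

    Assumption: sample SD (n - 1 denominator) is used.
    """
    groups = {}
    for value, sex in zip(values, sexes):
        groups.setdefault(sex, []).append(value)
    stats = {sex: (mean(vals), stdev(vals)) for sex, vals in groups.items()}
    return [(value - stats[sex][0]) / stats[sex][1]
            for value, sex in zip(values, sexes)]


# Toy SMI values (hypothetical, not study data).
toy_smi = [50.0, 40.0, 54.0, 44.0, 58.0, 48.0]
toy_sex = ["M", "F", "M", "F", "M", "F"]
toy_z = sex_specific_z_scores(toy_smi, toy_sex)
print(toy_z)
```

Standardising within sex puts the male and female distributions on a common scale, so a z-score of 0 corresponds to the mean of the patient's own sex group.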

PheWAS
A PheWAS was performed using previously described methodology.19 The association between sex-specific z-scores of CT-derived SMI and SMD and phenotypes of interest was determined. The EHR ICD-9 and ICD-10 CM codes for each patient between one month prior to and three months following the time of CT acquisition were recorded. ICD-9 and -10 CM codes were grouped into phecodes, which are medical phenotypes defined by medical findings, symptoms, and diagnoses. To do so, each ICD-9 and ICD-10 CM code was mapped to a phecode using version 1.2 of the Phecode Map.20,21 The Phecode Map includes medical phenotype groupings consisting of 16 categories: infectious diseases, neoplasms, endocrine/metabolic, hematopoietic, mental disorders, neurological, sense organs, circulatory system, respiratory, digestive, genitourinary, dermatologic, musculoskeletal, congenital anomalies, symptoms, and injuries & poisonings. A phecode count of 2 or more for an individual corresponded to a case for that medical phenotype and a count of 0 corresponded to a control; patients with a phecode count of 1 were not included in the association study for that medical phenotype. A total of 1206 phecodes with fewer than 100 cases or controls were excluded from analysis. Logistic regression using patient age and sex as covariates was used to determine associations between SMI and SMD z-scores and each phecode.
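The case and control rules above can be written as two small predicates. The Python sketch below is an illustration under the rules stated in the text only; the function names are hypothetical, and the published analysis was carried out with the R PheWAS package rather than with this code.

```python
def phecode_status(count):
    """Classify one patient for one phecode from the number of matching codes."""
    if count >= 2:
        return "case"
    if count == 0:
        return "control"
    # Exactly one occurrence: left out of the analysis for this phecode.
    return "excluded"


def phecode_is_analysed(n_cases, n_controls, minimum=100):
    """A phecode is kept only if it has at least `minimum` cases and controls."""
    return n_cases >= minimum and n_controls >= minimum


# Hypothetical per-patient counts for a single phecode.
toy_counts = [0, 1, 2, 5, 0]
toy_status = [phecode_status(c) for c in toy_counts]
print(toy_status)
```

Dropping patients with a single code occurrence reduces the chance that a one-off or rule-out code is treated as an established diagnosis.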
Nonlinear associations between SMI, SMD and medical phenotypes were also explored. Specifically, a nonlinear term (SMI*SMI or SMD*SMD) was added to the model and assessed for significance at the Bonferroni corrected threshold. For significant associations of the nonlinear term, likelihood ratio tests were used to determine whether the model containing the nonlinear term was preferred to the linear only model.
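A likelihood ratio test of this kind compares the maximised log-likelihoods of the two nested models. The Python sketch below is not the authors' R code: it assumes the usual chi-square reference distribution with one degree of freedom for the single added squared term, and the function name and log-likelihood values are hypothetical.

```python
import math


def likelihood_ratio_test(loglik_linear, loglik_nonlinear):
    """Compare a linear-only model with one that adds a single squared term.

    Returns the test statistic and its P value under a chi-square
    distribution with 1 degree of freedom (an assumption of this sketch).
    """
    statistic = max(2.0 * (loglik_nonlinear - loglik_linear), 0.0)
    # Chi-square survival function for 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical log-likelihoods from the two fitted logistic models.
stat, p = likelihood_ratio_test(-100.0, -98.07925)
print(stat, p)
```

A small P value indicates that adding the squared term improves the fit by more than would be expected by chance.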
The code to reproduce the PheWAS analysis is available at the following website: https://github.com/StanfordMIMI/muscle_phewas/muscle_phewas.R.

Statistical analysis
Statistical analyses were performed in R version 4.1.1 (R Core Team). Demographic characteristics of the study population were compared using Welch's t-test for continuous variables and the two-proportion z-test for proportions, using a level of significance α = 0.05. SMI and SMD were compared across age and sex using bootstrapped two-factor ANOVA by sampling with replacement 1000 times, due to heteroskedasticity as determined by Levene's test. PheWAS analyses were performed with the R PheWAS package (version 0.99.5-5).19 PheWAS P values were considered significant at the Bonferroni corrected threshold (0.05/1222) and are reported as Bonferroni corrected P values (P value × 1222). As an additional criterion, the 95% confidence interval of an association's odds ratio was required to exclude the value 1.0. Only associations meeting both criteria were considered significant.
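The two significance criteria can be expressed compactly in Python. In this sketch the function names are hypothetical, capping the corrected P value at 1.0 is an assumption not stated in the text, and the reading of 1222 as 611 phenotypes for each of the two muscle metrics is an inference from the reported numbers rather than a statement in the paper.

```python
ALPHA = 0.05
N_TESTS = 1222  # equals 2 * 611; interpretation as 2 metrics x 611 phenotypes is assumed
THRESHOLD = ALPHA / N_TESTS


def bonferroni_corrected(p_value):
    """Reported P value: raw P multiplied by the number of tests (capped at 1.0)."""
    return min(p_value * N_TESTS, 1.0)


def is_significant(p_value, ci_low, ci_high):
    """Both criteria: raw P below the threshold and a 95% CI that excludes 1.0."""
    below_threshold = p_value < THRESHOLD
    ci_excludes_one = not (ci_low <= 1.0 <= ci_high)
    return below_threshold and ci_excludes_one


print(THRESHOLD)
```

For instance, an odds ratio with a 95% CI of 0.55 to 0.64 passes the interval criterion, and the association is then significant whenever its raw P value is also below roughly 4.1 × 10⁻⁵.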

Role of the funding source
The funding sources used in this study did not have any involvement in the study design, collection, analysis, interpretation of data, writing of the report or the decision to submit the paper for publication.

Ethics
This study was approved by the Stanford University institutional review board (IRB-58903), which waived the requirement for informed consent given the minimal risk to subjects. The study complied with the stipulations set forth by the institutional review board.

PheWAS analysis
PheWAS analysis included 611 medical phenotypes. Table S1 shows the full list of medical phenotypes studied and the magnitude of association to CT muscle metrics. CT-derived SMI was significantly associated (P < 0.05/1222) with 268 medical phenotypes and SMD with 340 medical phenotypes. A total of 213 medical phenotypes were significantly associated (P < 0.05/1222) with both SMI and SMD. All associations that were significant at the Bonferroni corrected threshold also met the odds ratio confidence interval criterion. For SMI and SMD, 143 and 122 associations, respectively, met the confidence interval criterion but did not meet the Bonferroni threshold criterion.
Fig. 4 shows a Manhattan plot describing the significance of the association of CT-derived muscle metrics and medical phenotypes. The medical phenotype group with the highest proportion of statistically significant associations was hematopoietic: 75% (18/24) and 92% (22/24) for SMI and SMD, respectively. Fig. 5 shows an alternative visualization of the data using a Volcano plot, where both the effect size and statistical significance of each association are shown. Interactive versions of Figs. 4 and 5 are available at the following website: https://github.com/StanfordMIMI/muscle_phewas/blob/main/muscle_phewas.R. This multimedia supplementary material also supports interactive display of odds ratios for the associations between SMI and SMD and ICD-derived medical phenotypes.
Table 4 shows associations between CT-derived muscle metrics and medical phenotypes that, to our knowledge, have not been previously described in the literature.
Among the nonlinear terms, 29 for SMI and 81 for SMD were statistically significant at the Bonferroni threshold. In all such cases the model containing the nonlinear term was preferred. Table S2 summarizes for each of these associations whether they had been determined statistically significant in the linear only model, and whether in the nonlinear model the linear, square or both terms were found to be significant. Among the 29 nonlinear SMI models, 8 associations that were not significant in the linear only model were found to be significant in the nonlinear model. Twelve of the 81 SMD associations were not significant in the linear only model but were significant in the nonlinear model. The remaining associations had been previously identified as significant in the linear only analyses. Table S3 includes the nonlinear associations where the nonlinear model was preferred.

Discussion
Our PheWAS study used abdominal CT scans of 17,646 patients acquired during routine clinical practice to reveal previously unreported associations between CT-derived biomarkers of myopenia and myosteatosis and EHR-derived medical phenotypes. In addition to providing body composition distributions for SMI and SMD in a large North American cohort, we validate prior findings that illustrate their variability across age and sex subgroups. Furthermore, we validated several known associations of SMI and SMD with medical phenotypes. The tools we provide to analyse and visualize these associations facilitate further hypothesis generation in studies of CT-derived metrics and their relation to medical phenotypes.
We identified previously unreported medical phenotype associations for both SMI and SMD. For example, with increased SMI, we showed associations with decreased odds of supraventricular premature beats and other specified cardiac dysrhythmias. Dual-energy X-ray absorptiometry-derived metrics of myopenia have been linked with other specific types of cardiac dysrhythmia, such as premature ventricular contraction in individuals with normal body weight and atrial fibrillation among obese individuals,29 which suggests associations between skeletal muscle quantity and myocardial conductivity. We showed these associations extend to other types of dysrhythmias and are also relevant in CT-based metrics. Increased SMI was also significantly associated with decreased odds of the epilepsy phenotype. Skeletal muscle is increasingly recognized as a metabolically active tissue that releases endocrine-like substances that affect other organs, including the brain.30 Examples of muscle-brain interactions previously described include muscle-secreted myokines and neurotrophic factors that can stimulate synaptic plasticity, promote neurogenesis, and improve cognition. People with epilepsy have been previously identified as being less physically active, with decreased aerobic endurance, muscle strength and physical flexibility.31 Our findings suggest people with epilepsy are also more likely to have decreased SMI, a surrogate of myopenia. The potential causality and direction of this association can be studied in future work. These examples illustrate the ability of PheWAS analyses to generate new evidence for associations between CT-derived metrics and EHR-derived medical phenotypes. Future studies will be needed to confirm the previously unreported associations of CT-derived muscle metrics with cardiac and brain disorders, and to identify the potential biologic causal mechanisms underlying the associations.
Our PheWAS analysis also validated medical phenotype associations with SMI and SMD that were previously reported in the literature. For example, we showed an association between protein-calorie malnutrition and decreased SMI, in accordance with prior studies showing malnutrition as one of the primary causes of myopenia.22 We also showed morbid obesity associated with increased SMI and inversely associated with SMD (OR [95% CI]: 2.42 [2.24-2.62] for SMI, 0.28 [0.26-0.31] for SMD). This is consistent with a previously described increase in lean body mass among patients with obesity, with an accompanying decrease in SMD.32,33 These examples illustrate the power of PheWAS analysis to confirm known associations between CT-derived metrics and diseases.
There are several additional implications of our results. Our study shows the value of leveraging routine CT exams opportunistically at a population scale. In the future, such automated analysis of routine CT scans may improve patient care. Identification of CT biomarkers of myopenia and myosteatosis and their associations with diseases may provide value to many different types of clinicians. For example, urologists may be interested in CT screening, since our findings show that in males elevated prostate-specific antigen (PSA) is associated with increased SMI (OR [95% CI] 1.84 [1.47-2.31]). Increased testosterone levels may increase both SMI and PSA. Alternatively, rates of low serum testosterone and prostate cancer increase with age, as do rates of low muscle mass.34 Our study shows increased muscle mass with obesity, increasing the potential complexity of this association. Further studies may evaluate whether a different threshold for increased prostate-specific antigen should be used when screening patients to account for potential confounding effects of varying muscle composition. We encourage physicians and other researchers with other specific interests to explore further associations of SMI, SMD, and specific medical phenotypes in Tables S1 and S3 and our interactive Figs. 4 and 5 available on our website. Given our findings, different study designs such as case-control studies may be performed to assess the previously unreported PheWAS associations. Additionally, future basic science research may aim to establish potential causal mechanisms between new and validated associations, such as those identified by our study for myopenia with cardiac dysrhythmias and epilepsy.
Our study has several limitations. PheWAS analyses can identify many associations between imaging biomarkers and medical phenotypes; however, subsequent studies with alternative study designs should be carried out to study the nature of newly described associations, which may or may not be causal and may result from unobserved confounding. Additionally, such future studies may also clarify the role of adjustment variables as confounders or covariates for a specific association. Further, this was a retrospective analysis of data from a single institution. The associations between CT metrics and diagnoses may be biased in this cohort of patients undergoing abdominal CT examination during routine practice. However, there is pragmatic value in the high-throughput opportunistic analysis of already-acquired CT scans, and our cohort is representative of patients undergoing CT examinations in routine practice. Furthermore, although our cohort is large and contains individuals of diverse racial and ethnic backgrounds (Asian, Hispanic, and White), Black and Native American patients are underrepresented. In addition, we could not reliably identify individuals receiving gender-affirming therapy due to possible limitations in the documentation of sex vs gender in our EHR. Given that gender-affirming therapy may impact body composition,37 subsequent studies should evaluate the potential effect of such therapy on CT-derived muscle metrics and associated medical phenotypes. Furthermore, our methodology for identifying medical phenotypes relies on ICD codes, which may not precisely capture medical phenotypes since they are used generally for billing and quality improvement. We used this approach since it has been extensively validated in prior PheWAS work outside of radiology. Future studies may explore different, less-established medical phenotype definitions, such as deep phenotyping or laboratory exam-based phenotypes.14,38
In summary, our study illustrates the power of PheWAS analysis for broad radiology research in general and for myopenia and myosteatosis research in particular. We found evidence for previously unreported associations between CT-derived markers of myopenia and myosteatosis with medical phenotypes. In addition to facilitating hypothesis generation, this study paves the way for other imaging-based PheWAS studies exploring the associations of imaging biomarkers with medical phenotypes on a population scale.

Fig. 3: Distributions of SMI (a) and SMD (b) stratified by age decile and sex illustrate differences across subgroups. In each age group the lower black box edge, white circle, and upper black box edge mark the 25th, 50th, and 75th percentile, respectively. SMI, skeletal muscle index; SMD, skeletal muscle density.

Fig. 4: Manhattan plot for PheWAS analysis studying associations between SMI (a) and SMD (b) and ICD-derived medical phenotypes. The x-axis shows the medical phenotype groups evaluated; the y-axis shows the magnitude of significance as measured by the negative logarithm of the P value. The horizontal red line corresponds to the Bonferroni significance threshold (0.05/1222). The direction of the markers indicates the direction of the association (e.g., marker pointing up indicates higher odds with increased muscle metric). SMI, skeletal muscle index; SMD, skeletal muscle density; ICD, International Classification of Diseases.

Fig. 5: Volcano plot of PheWAS associations between SMI (a) and SMD (b) and ICD-derived medical phenotypes. Statistically significant associations are marked in red. The horizontal red bar corresponds to the Bonferroni significance threshold (0.05/1222). The 5 medical phenotypes with highest and lowest magnitude of association (|Log Odds Ratios|) are annotated. SMI, skeletal muscle index; SMD, skeletal muscle density; ICD, International Classification of Diseases.

Table 1
Welch's t-test]). The decreases in SMI and SMD with increasing age in men and women are shown in Table 2 and Fig. 3. Table 3 lists the prevalence of medical phenotypes for the 16 diagnostic groups. Among grouped medical phenotypes, symptoms had the highest average prevalence (1343/17,270; 7.6%), whereas congenital anomalies had the lowest (128/17,576; 0.7%).

Table 2: Muscle CT metrics according to age and sex groups.

Table 3: Prevalence of grouped medical phenotypes in study population.

Table 4: Top 5 previously unreported associations between SMI (a) and SMD (b) and medical phenotypes. Top refers to those with highest magnitude of association without prior evidence in the literature. SMI, skeletal muscle index; SMD, skeletal muscle density; OR, sex- and age-adjusted odds ratio; CI, confidence interval; n, number.

Table 5: Top 5 validated associations between SMI (a) and SMD (b) and medical phenotypes. Top refers to those with highest magnitude of association and validated refers to those with prior evidence in the literature. SMI, skeletal muscle index; SMD, skeletal muscle density; OR, sex- and age-adjusted odds ratio; CI, confidence interval; n, number.